Genetic variability of E6 and E7 genes of human papillomavirus type 58 in Jingzhou, Hubei Province of central China

Background Cervical cancer is a common malignant tumor in women; its high mortality rate makes it a serious threat to women’s health. Long-term, persistent infection with high-risk human papillomavirus (HR-HPV) is the main cause of the occurrence and development of cervical cancer. Methods The infection rate of HPV-58 is relatively high in the Jingzhou area. In this study, 172 complete HPV-58 E6-E7 sequences were amplified by polymerase chain reaction (PCR), the amplified products were sequenced, and the gene variations of HPV-58 E6-E7 were analyzed. A Neighbor-Joining phylogenetic tree was constructed with MEGA 11. The secondary structures of the E6 and E7 proteins were investigated. PAML X was used to analyze selective pressure. The B cell epitopes of the HPV-58 E6 and E7 proteins were predicted with the ABCpred server. Results In the E6 sequences, 10 single nucleotide variants were observed, including 7 synonymous and 3 non-synonymous variants. In the E7 sequences, 12 single nucleotide variants were found, including 3 synonymous and 9 non-synonymous variants. Five of these variants were novel. The phylogenetic analysis showed that all the E6-E7 sequences belonged to lineage A. No positively selected site was found in the E6 sequences, but G63 in the E7 sequences was identified as a positively selected site. Some amino acid substitutions affected multiple B cell epitopes. Conclusion The E6 and E7 variation data reported here may prove useful for the development of better diagnostics and vaccines for the Jingzhou region of Hubei Province, central China.


Background
Cervical cancer is a common malignant tumor in women. According to global statistics for 2020 [1], cervical cancer is the fourth most frequently diagnosed cancer and the fourth leading cause of cancer death in women, with an estimated 604,172 new cases and 341,831 deaths worldwide. These account for 6.5% and 7.7% of new cancer cases and cancer deaths worldwide, respectively. The burden of cervical cancer is higher in developing countries, where it has a tremendous impact on the lives of millions of women [2].
Human papillomavirus (HPV) is a spherical, nonenveloped virus with a double-stranded circular DNA genome. It shows strong tropism for squamous epithelium and infects damaged epithelial cells [3]. More than 200 types of HPV have been fully characterized [4]. HPVs have been divided into low-risk types (HPV-6, 11, etc.) and high-risk types (HPV-16, 18, 52, 58, etc.) based on their risk of causing cancer of the uterine cervix [5].
Long-term, persistent infection with HR-HPV has been shown to be closely related to the occurrence and development of cervical cancer [6]. However, the prevalent HPV types differ significantly between regions. In China, HPV-16, HPV-52, and HPV-58 are the most common types, and the infection rate of HPV-58 is 15.31% [7], which is significantly higher than the global infection rate (4.7%) [8]. HPV-58 belongs to Alpha-papillomavirus 9 (α9 HPV) and is relatively prevalent in China and other Asian countries [9]. The HPV-58 genome can be divided into an early region (E1, E2, E4, E5, E6, and E7), a late region (L1 and L2), and a long control region (LCR). Two HPV proteins, E6 and E7 (encoded by the E6 and E7 genes, respectively), are the main drivers of cervical cancer development. The E6 protein binds to a cellular ubiquitin ligase (E6AP) to form a complex, which then binds specifically to the p53 tumor suppressor protein. As a result, p53 is rapidly degraded, which lowers the cellular p53 level. The E7 protein interacts with the retinoblastoma gene product (pRb) and inhibits the activity of this tumor suppressor protein through ubiquitination [10]. When HPV integrates into the host cell genome, E6 and E7 are invariably retained and expressed in an uncontrolled manner, leading to cell immortalization and malignant transformation [11][12][13].
*Correspondence: meibing008@163.com. Department of Laboratory Medicine, Jingzhou Hospital, Yangtze University, Jingzhou 434020, China.
Viral sequence variation may cause differences in immune response and oncogenic potential [14]. There are some indications that the HPV-58 prototype and HPV-58 variants differ to some extent in their biological and biochemical characteristics. Several HPV-58 variants have been reported in Latin America, Japan, and southwest and northwest China [15][16][17][18], but few data are available on variants of the HPV-58 E6 and E7 genes in central China. In this study, we aimed to analyze the gene polymorphisms and epidemic characteristics of the HPV-58 E6 and E7 genes in the Jingzhou area of central China, and to evaluate the possible influence of E6-E7 sequence variations on antigen immunogenicity by describing the putative B cell epitopes affected by amino acid changes in HPV-58 E6-E7. Our results provide experimental data for further studies on therapeutic vaccine development, epidemiology, and prevention of HPV-58.

Specimen collection
In this study, samples of cervical exfoliated cells were collected from female patients with possible HR-HPV infection who underwent routine cervical screening at the Department of Gynecology of Jingzhou Hospital from Sep. 2019 to Oct. 2021. All samples were stored at 4 °C, and DNA was extracted within 1 day. Informed consent was obtained from the patients before sample collection, and the study subjects' privacy was protected.

Analysis of E6 and E7 genes
Nucleotide variants and amino acid substitutions were identified by comparing all sequencing results with the HPV-58 reference sequence (D90400) using DNASTAR and NCBI BLAST.
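The distinction between synonymous and non-synonymous variants used throughout the Results can be illustrated with a minimal Python sketch. It is not the DNASTAR/BLAST workflow described above; the function name `classify_snv` and the toy reading frame `TOY_ORF` are hypothetical and are not taken from D90400. The sketch substitutes one base in a reference open reading frame, translates the affected codon with the standard genetic code, and reports whether the amino acid changes.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snv(ref_orf, orf_pos, alt_base):
    """Classify a single nucleotide variant inside an open reading frame.

    ref_orf  : reference ORF sequence, starting at the first base of codon 1
    orf_pos  : 1-based position of the variant within the ORF
    alt_base : the variant base
    Returns (codon_number, ref_aa, alt_aa, kind).
    """
    codon_index = (orf_pos - 1) // 3   # 0-based codon index
    offset = (orf_pos - 1) % 3         # 0-based position inside the codon
    ref_codon = ref_orf[codon_index * 3: codon_index * 3 + 3]
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return codon_index + 1, ref_aa, alt_aa, kind


# Hypothetical toy ORF (Met-Gly-Lys-stop), used only for illustration.
TOY_ORF = "ATGGGAAAATGA"
print(classify_snv(TOY_ORF, 4, "A"))  # GGA -> AGA: (2, 'G', 'R', 'non-synonymous')
print(classify_snv(TOY_ORF, 6, "C"))  # GGA -> GGC: (2, 'G', 'G', 'synonymous')
```

In a real analysis the reference ORF would be extracted from D90400 and each observed variant passed through the same function.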

Secondary structure analysis and selective pressure analysis
The online server PsiPred (http://bioinf.cs.ucl.ac.uk/psipred/) was used to predict the secondary structure of the E6 and E7 amino acid sequences. Selective pressure was analyzed with the PAML X software. The ratio of nonsynonymous to synonymous nucleotide divergence (dN/dS) was calculated with the Codeml program. Using Bayes Empirical Bayes (BEB) analysis, sites with a posterior probability ≥ 0.95 were identified as positively selected sites.
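For readers less familiar with codon-based selection tests, the quantity being estimated can be written out as follows. The symbol ω and the three-way interpretation are the usual convention for codon substitution models and are not stated explicitly in the text; the 0.95 cut-off is the one given above.

```latex
\omega = \frac{d_N}{d_S}, \qquad
\begin{cases}
\omega < 1 & \text{purifying (negative) selection} \\
\omega = 1 & \text{neutral evolution} \\
\omega > 1 & \text{positive selection}
\end{cases}
\qquad
\text{site } i \text{ is reported as positively selected if }
\Pr(\omega_i > 1 \mid \text{data}) \ge 0.95 .
```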

B cell epitope prediction
We predicted linear B cell epitopes of the E6 and E7 proteins with the online software ABCpred (iiitd.edu.in) using the default parameters. The higher the predicted score, the better the affinity of the epitope.
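To see how an amino acid substitution maps onto a predicted epitope, the following Python sketch checks whether a substitution falls inside an epitope window and, if so, returns the substituted peptide. The peptide and its coordinates (E7 33-48) and the substitution labels are those reported in the Discussion; the helper name `apply_substitution` is hypothetical, and the sketch does not compute any ABCpred score. Rescoring would still require submitting the substituted peptide to the server.

```python
from typing import Optional


def apply_substitution(peptide: str, start: int, substitution: str) -> Optional[str]:
    """Apply a substitution such as 'G41R' to an epitope peptide.

    peptide      : epitope sequence
    start        : 1-based protein position of the first residue of the peptide
    substitution : reference residue, protein position, variant residue
    Returns the substituted peptide, or None if the position lies outside it.
    """
    ref_aa, alt_aa = substitution[0], substitution[-1]
    position = int(substitution[1:-1])
    end = start + len(peptide) - 1
    if not start <= position <= end:
        return None  # the substitution does not touch this epitope
    index = position - start
    if peptide[index] != ref_aa:
        raise ValueError(
            f"expected {ref_aa} at position {position}, found {peptide[index]}"
        )
    return peptide[:index] + alt_aa + peptide[index + 1:]


# Epitope reported in the Discussion: E7 residues 33-48.
E7_EPITOPE = "DEDEIGLDGPDGQAQP"
print(apply_substitution(E7_EPITOPE, 33, "G41R"))  # DEDEIGLDRPDGQAQP
print(apply_substitution(E7_EPITOPE, 33, "G63D"))  # None (outside residues 33-48)
```

The residue check inside the function guards against off-by-one errors in the epitope coordinates: it raises an error if the reference residue named in the label is not found at the stated position.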

Variation of E6 and E7 genes
There were 319 samples positive for HPV-58 alone (single infection), some of which failed to amplify because of low viral load or degradation of viral nucleic acid. In the end, 172 complete E6-E7 sequences were successfully amplified and analyzed further. Compared with the reference sequence D90400, the prototype accounted for 18.02% (31/172) of the sequenced samples, and the remaining 141 sequences fell into 21 variant groups, named 58HB01 to 58HB21, which were submitted to the GenBank database under accession numbers OL989036 to OL989056.
A total of 22 single nucleotide polymorphisms (SNPs) were observed in the E6-E7 sequences. In the E6 sequences, 10 SNPs were found, including 7 synonymous and 3 non-synonymous variants. In the E7 sequences, 12 SNPs were found, including 3 synonymous and 9 non-synonymous variants. The variants are listed in Table 2.

Selective pressure analysis
Variable dN/dS rate ratios among lineages were tested using PAML X. No positively selected site was found in the HPV-58 E6 sequence. In E7, codon site G63 was identified as a positively selected site. All the results are shown in Table 3.

B cell epitope prediction of HPV-58 E6 and E7 proteins
To evaluate the influence of HPV-58 E6 and E7 sequence polymorphisms on host immune recognition and response, the online server ABCpred was used to predict potential B cell epitopes of the E6 and E7 proteins. Only epitopes with a prediction score greater than 0.85 were considered. The results are shown in

Discussion
Persistent infection with HR-HPV is the main factor in the occurrence and development of cervical cancer. The E6 and E7 proteins, encoded by the E6 and E7 genes, are the main carcinogenic proteins. In the present study, we found that HPV-52, 58, and 16 were the prevalent types in the Jingzhou area. The infection rate of HPV-58 was 17.00%, ranking second. This unusually high prevalence has also been reported in Chinese populations in Xi'an, Guangxi, and Chongqing [20][21][22]. In this study, the prevalence of HPV-58 peaked in patients older than 67 years, suggesting that more attention should be paid to HPV screening in middle-aged women; this provides a basis for the prevention, screening, and treatment of cervical cancer in the Jingzhou area.
The E6-E7 sequencing results showed that sequence variability was higher in E7 (4.0%) than in E6 (2.2%), suggesting that E6 may be a more suitable target than E7 for an HPV therapeutic vaccine. In the E6 gene, C307T and A388C (K93N) were common variants. In the E7 gene, T744G was the most prevalent synonymous variant, and G761A (G63D), G694A (G41R), and T803C (V77A) were common non-synonymous variants. Four non-synonymous variants were located in sequences encoding secondary structure elements of the E6 and E7 proteins. They may change oncoprotein folding and function and thereby alter the proteins' ability to interact with tumor suppressor proteins, thus affecting the pathogenicity of HPV-58 [23]. Among the 21 variant groups, 58HB04, carrying the C307T/G694A/T744G/G761A variants, was the most prevalent (43/172, 25.00%), which is consistent with a previous Colombian study [24]. HPV-58 E6-E7 gene polymorphisms have been studied elsewhere, and the common E6 and E7 variants also occur at high frequency in other areas [15,16,25,26,27]. Some studies have examined HPV-58 E6 and E7 genetic diversity and its association with the risk of developing cervical lesions. In Hong Kong, the A388C (K93N) substitution in E6 is more prevalent in women with a normal cervix or low-grade lesions and shows a statistically significant negative trend of association with the severity of neoplasia [28]. This is consistent with a study on HPV-58 in the Shanghai area, in which the A388C (K93N) substitution in E6 significantly reduced the risk of high-grade squamous intraepithelial lesions (HSIL) [29].
Table 2 The variations of HPV-58 E6-E7 gene. The nucleotides matching the reference (GenBank: D90400) are marked with a dash (-). AA, amino acid; S, strand.
In E7, the C632T (T20I) and G760A (G63S) variants are positively correlated with the severity of neoplasia in Hong Kong [28], a finding also reported in a study in Zhejiang by Ding et al. [30]. The exact mechanism behind the increased oncogenicity of these variants needs further investigation. G261A (R51K), A310G, A340G, and T407G (L100V) in the E6 gene and A597G in the E7 gene were newly identified in this study and have not been reported previously; they may provide useful data for the development of regional therapeutic vaccines. Our study provides baseline data for further studies of the relationship between E6-E7 gene variations and the severity of cervical lesions. HPV-58 has four lineages: A, B, C, and D. Lineage A is significantly more frequent in Asia than in America, lineage B is mainly distributed in America, and lineages C and D seem to be more frequent in Africa [31]. In this study, lineage A was the most prevalent in the Jingzhou area, especially sub-lineage A1, which is also prevalent in Yunnan, Sichuan, Zhejiang, and other regions [16,25,32]. The higher oncogenicity and infection rate of sub-lineage A1 may explain why HPV-58 is associated with a high incidence of cervical cancer in China.
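The variant labels in this discussion pair a genome position with a codon number, for example G694A (G41R). The Python sketch below shows the arithmetic linking the two. The ORF start coordinates are an assumption: they are not stated in the text and were chosen here because they reproduce the codon numbers quoted above, so they should be checked against the D90400 annotation before any reuse. The names `ASSUMED_ORF_START` and `codon_of` are hypothetical.

```python
# ASSUMPTION: 1-based genome coordinate of the first base of each ORF in D90400.
# These values are not given in the text; verify them against the GenBank record.
ASSUMED_ORF_START = {"E6": 110, "E7": 574}


def codon_of(gene, genome_pos):
    """Map a 1-based genome position to (codon number, position within codon)."""
    offset = genome_pos - ASSUMED_ORF_START[gene]
    if offset < 0:
        raise ValueError(f"position {genome_pos} lies upstream of {gene}")
    return offset // 3 + 1, offset % 3 + 1


# Variant labels taken from the text above.
for gene, pos, label in [
    ("E6", 388, "A388C (K93N)"),
    ("E6", 407, "T407G (L100V)"),
    ("E7", 694, "G694A (G41R)"),
    ("E7", 761, "G761A (G63D)"),
]:
    codon, base = codon_of(gene, pos)
    print(f"{label}: codon {codon}, base {base} of the codon")
```

Under these assumed coordinates, G760A and G761A fall on the first and second base of the same E7 codon (63), which is consistent with the two different substitutions, G63S and G63D, reported at that residue.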
At present, all HPV vaccines on the market (Cervarix, Gardasil, and Gardasil-9) are prophylactic vaccines based on L1 protein immunogenicity, which prevent HPV infection effectively by inducing specific neutralizing antibodies. However, these vaccines do not prevent cancer development in individuals who are already infected [33]. Therapeutic vaccines therefore have broad application prospects in the management and control of HPV infection, and the HPV E6 and E7 genes are ideal targets for the design of therapeutic vaccines against HPV [34,35]. We used the ABCpred server to predict the B cell epitopes of the HPV-58 E6 and E7 proteins and analyzed the effect of amino acid changes caused by gene variation on these epitopes. In this study, only the G694A (G41R) nucleotide substitution increased the epitope score, that of DEDEIGLDGPDGQAQP (E7 33-48), while all the other variations decreased the epitope score. B cells play a significant role in HPV-related tumor immunotherapy and in the response to cervical lesions and cancers caused by HPV [36]. By analyzing the impact of non-synonymous variants on the affinity of B cell epitopes, we can further understand their influence on the immune function of host cells. The B cell epitopes predicted in this study can be considered potential candidates for the development of therapeutic vaccines against HPV-58 in the Chinese population, and amino acid substitutions must be considered in vaccine development. These predicted epitopes need to be verified by in vitro and in vivo experiments in subsequent studies. This study also analyzed positive selection in the HPV-58 E6 and E7 sequences from the Jingzhou area. The main feature of positive selection is that it causes an unusually rapid increase in allele frequency, which allows a species to adapt quickly to environmental changes [37].
The positively selected site G63 found in the E7 gene may have evolutionary significance for HPV-58 adaptation, meaning that variation at this site may help the virus adapt to changes in its environment. However, relevant research is scarce, and more studies are needed to confirm this hypothesis.

Conclusion
In this study, we analyzed the gene polymorphisms of the HPV-58 E6-E7 sequences and the amino acid substitutions of the E6 and E7 proteins in the Jingzhou area, constructed a phylogenetic tree to analyze lineage characteristics, and predicted B cell epitopes to clarify the regional specificity of HPV-58 gene variation. Some new variations and B cell epitopes were observed, and we obtained an overview of HPV-58 gene polymorphism in the Jingzhou area. This study supplements current knowledge of the genetic diversity of HPV-58 in central China and provides an experimental basis for subsequent studies on the epidemiology, evolution, pathogenicity, and therapeutic targets of HPV-58.

Funding
This work was supported by Central Funds Guiding the Local Science and Technology Development of Hubei Province, Grant/Award Number: 2019ZYYD066.

Availability of data and materials
The data generated during the current study are available in the NCBI Nucleotide repository (nih.gov). The sequence data were submitted to GenBank under accession numbers OL989036-OL989056.